The simplified partial digest problem: Approximation and a graph-theoretic model
Authors
Abstract
The Simplified Partial Digest Problem (SPDP) is motivated by the reconstruction of the linear structure of a DNA chain with respect to a given nucleotide pattern, based on the multiset of distances between adjacent patterns (interpoint distances) and the multiset of distances between each pattern and the two unlabeled endpoints of the DNA chain (end distances). We consider optimization versions of the problem, called SPDP-Min and SPDP-Max. The aim of SPDP-Min (SPDP-Max) is to find a DNA linear structure with the same multiset of end distances and the minimum (maximum) number of incorrect (correct) interpoint distances. Results are presented on the worst-case efficiency of approximation algorithms for these problems. We suggest a graph-theoretic model for SPDP-Min and SPDP-Max, which can be used to reduce the search space for an optimal solution in either of these problems. We also present heuristic polynomial-time algorithms based on this model. In computational experiments with randomly generated and real-life input data, our best algorithm delivered an optimal solution in 100% of the instances with at most 50 restriction sites.

This work was supported by the Marie Curie BIOPTRAIN fellowship of Alexandr Kovalev, EPSRC grant GR/S64530/01 and the Ministry of Science of Poland grant N N519 314635.

Authors: Jacek Blazewicz, Edmund K. Burke, Marta Kasprzak, Alexandr Kovalev, Mikhail Y. Kovalyov

Preprint submitted to Elsevier, July 27, 2010
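The two multisets named in the abstract can be illustrated with a small Python sketch. It is not taken from the paper: the function names, the representation of a site map as integer positions on a chain of known length, and the reading of "correct interpoint distances" as the multiset intersection with the observed distances are all assumptions made for illustration.

```python
from collections import Counter


def end_distances(sites, length):
    """Multiset of distances from each site to both chain endpoints (assumed layout: chain spans 0..length)."""
    dists = Counter()
    for x in sites:
        dists[x] += 1           # distance to the left endpoint
        dists[length - x] += 1  # distance to the right endpoint
    return dists


def interpoint_distances(sites):
    """Multiset of distances between adjacent sites."""
    ordered = sorted(sites)
    return Counter(b - a for a, b in zip(ordered, ordered[1:]))


def correct_interpoint_count(candidate_sites, observed_interpoint):
    """Count candidate interpoint distances that also occur in the observed multiset."""
    candidate = interpoint_distances(candidate_sites)
    # Counter intersection keeps the minimum multiplicity of each distance.
    return sum((candidate & Counter(observed_interpoint)).values())


# Hypothetical instance: chain of length 10 with true sites at 2, 5 and 9.
true_sites = [2, 5, 9]
observed = interpoint_distances(true_sites)

# A different map with the same end distances but fewer matching interpoint distances.
other_sites = [1, 2, 5]
same_ends = end_distances(true_sites, 10) == end_distances(other_sites, 10)
score_true = correct_interpoint_count(true_sites, observed)
score_other = correct_interpoint_count(other_sites, observed)
```

In this toy instance both maps share the same end distances, so under the assumptions above both would be feasible, and an SPDP-Max style objective would prefer the map whose interpoint distances match more of the observed ones.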
Similar resources
A Continuous Optimization Model for Partial Digest Problem
The purpose of this paper is to model the Partial Digest Problem (PDP) as a mathematical programming problem. In this paper we present a new viewpoint on the PDP. We formulate the PDP as a continuous optimization problem and develop a method to solve this problem. Finally, we construct a linear programming model for the problem with an additional constraint. This latter model can be solved by the simp...
Modeling of Partial Digest Problem as a Network flows problem
Restriction site mapping is one of the interesting tasks in computational biology. A DNA strand can be thought of as a string over the letters A, T, C, and G. When a particular restriction enzyme is added to a DNA solution, the DNA is cut at particular restriction sites. The goal of restriction site mapping is to determine the location of every site for a given enzyme. In partial digest metho...
Modeling thermodynamic properties of electrolytes: Inclusion of the mean spherical approximation (MSA) in the simplified SAFT equation of state
In this work, an equation of state is used for thermodynamic modeling of aqueous electrolyte solutions. The proposed equation of state combines the simplified statistical associating fluid theory (SAFT) equation of state (similar to simplified PC-SAFT), which describes the effect of short-range interactions, with a mean spherical approximation (MSA) term, which describes the effect of long-r...
The Simplified Partial Digest Problem: Enumerative and Dynamic Programming Algorithms
We study the Simplified Partial Digest Problem (SPDP), which is a mathematical model for a new simplified partial digest method of genome mapping. This method is easy to implement in the laboratory and robust with respect to experimental errors. SPDP is NP-hard in the strong sense. We present an O(n2^n) time enumerative algorithm (ENUM) and an O(n^(2q)) time dynamic programming algorithm for the er...
Noisy Data Make the Partial Digest Problem NP-hard
The problem of finding the coordinates of n points on a line such that the pairwise distances of the points form a given multiset of (n choose 2) distances is known as the Partial Digest problem, which occurs for instance in DNA physical mapping and de novo sequencing of proteins. Although Partial Digest was, as a combinatorial problem, already proposed in the 1930s, its computational complexity is still u...
Journal title:
- European Journal of Operational Research
Volume 208, Issue
Pages -
Publication date 2011